Look in right location for new style subdirectories #1209
aevernon wants to merge 1 commit into kaldi-asr:master from aevernon:master
  ln -s $dir/$subdir data/local/data/links
else
- new_style_subdir=$(echo $subdir | sed s/fe_03_p2_sph/fisher_eng_tr_sp_d/)
+ new_style_subdir=$(echo $subdir | sed s/fe_03_p1_sph/fisher_eng_tr_sp_d/)

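As an illustration of why the part number in the pattern matters (this is not part of the patch), the sketch below applies the old and the new sed pattern to a Part 1 subdirectory name. The variable names `old` and `new` are introduced here for the example only.

```shell
#!/usr/bin/env bash
# Illustration only: apply the old and new sed patterns to a Part 1 name.
subdir=fe_03_p1_sph1

# Old pattern: does not match a p1 name, so the name comes back unchanged.
old=$(echo "$subdir" | sed s/fe_03_p2_sph/fisher_eng_tr_sp_d/)

# New pattern: rewrites the p1 prefix to the new-style directory prefix.
new=$(echo "$subdir" | sed s/fe_03_p1_sph/fisher_eng_tr_sp_d/)

echo "old: $old"   # old: fe_03_p1_sph1
echo "new: $new"   # new: fisher_eng_tr_sp_d1
```

With the old pattern the script would keep looking for a directory that new-style releases do not have at the top level; with the new pattern the name maps to the new-style directory.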
my experience with a different format of the releases was that the best way i.e. using the corpora directory only as a starting point to find say …

I mean the command …

I suspect that your patch is not right. That script takes multiple LDC …

@aevernon, look at the usage message of the script more carefully. It expects 4 different LDC corpora, or a single directory where the contents of all of them reside. Closing the PR.

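Please consider re-opening this ticket. The details below make me think my patch is correct.

Per the usage message, I pass the four directories to fisher_data_prep.sh:

Steps to Reproduce
cd /kaldi/egs/aspire/s5
. ./cmd.sh
. ./path.sh
mfccdir=`pwd`/mfcc
set -e
local/fisher_data_prep.sh /export/corpora3/LDC/LDC2004T19 /export/corpora3/LDC/LDC2005T19 \
   /export/corpora3/LDC/LDC2004S13 /export/corpora3/LDC/LDC2005S13

Observed Behavior
local/fisher_data_prep.sh: could not find the subdirectory fe_03_p1_sph1 in any of /export/corpora3/LDC/LDC2004T19 /export/corpora3/LDC/LDC2005T19 /export/corpora3/LDC/LDC2004S13 /export/corpora3/LDC/LDC2005S13

Contents of my four corpora directories to show I have extracted them correctly:
ls /export/corpora3/LDC/LDC2004T19
fe_03_p1_tran
ls /export/corpora3/LDC/LDC2005T19
fe_03_p2_tran
ls /export/corpora3/LDC/LDC2004S13
fisher_eng_tr_sp_d1 fisher_eng_tr_sp_d2 fisher_eng_tr_sp_d3 fisher_eng_tr_sp_d4 fisher_eng_tr_sp_d5 fisher_eng_tr_sp_d6 fisher_eng_tr_sp_d7
ls /export/corpora3/LDC/LDC2005S13
fe_03_p2_sph1 fe_03_p2_sph2 fe_03_p2_sph3 fe_03_p2_sph4 fe_03_p2_sph5 fe_03_p2_sph6 fe_03_p2_sph7

Determining correct location of fe_03_p1_sph1:
find /export/corpora3 -name fe_03_p1_sph1
/export/corpora3/LDC/LDC2004S13/fisher_eng_tr_sp_d1/fe_03_p1_sph1

find shows that the patch is correct (at least for LDC data that I downloaded this month).

I agree with @jtrmal that using find would be more robust. The intention of this patch was to be a quick fix for others who might try to run this recipe.
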
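For reference, a find-based lookup of the kind discussed in this thread could look roughly like the sketch below. This is an assumption about how such a search might be written, not code from fisher_data_prep.sh; the helper name `find_subdir` and the temporary directory tree are hypothetical and exist only to make the example runnable.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in fisher_data_prep.sh): print the first directory
# named $1 found anywhere under the remaining arguments.
find_subdir() {
  local name=$1
  shift
  # -type d restricts matches to directories; head -n 1 keeps the first hit.
  find "$@" -type d -name "$name" 2>/dev/null | head -n 1
}

# Demo on a throwaway tree that mimics the new-style layout from the thread.
tmp=$(mktemp -d)
mkdir -p "$tmp/LDC2004S13/fisher_eng_tr_sp_d1/fe_03_p1_sph1"
mkdir -p "$tmp/LDC2005S13/fe_03_p2_sph1"

p1=$(find_subdir fe_03_p1_sph1 "$tmp/LDC2004S13" "$tmp/LDC2005S13")
p2=$(find_subdir fe_03_p2_sph1 "$tmp/LDC2004S13" "$tmp/LDC2005S13")
missing=$(find_subdir fe_03_p9_sph1 "$tmp/LDC2004S13" "$tmp/LDC2005S13")

echo "p1: $p1"
echo "p2: $p2"
echo "missing: ${missing:-<not found>}"
```

Because the search descends from each corpus root, it finds the subdirectory whether it sits at the top level (old-style) or one level down (new-style), at the cost of walking the whole tree.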
|
Oh OK. I did not realize you were using all 4 directories. I guess what
you have is the 'newer' data, and we have the older data which is why we
didn't see a problem.
I'll merge.
On Mon, Nov 28, 2016 at 11:14 AM, Albert Vernon wrote:
Please consider re-opening this ticket. The details below make me think my
patch is correct.
Per the usage message, I pass the four directories to fisher_data_prep.sh:
Steps to Reproduce
cd /kaldi/egs/aspire/s5
. ./cmd.sh
. ./path.sh
mfccdir=`pwd`/mfcc
set -e
local/fisher_data_prep.sh /export/corpora3/LDC/LDC2004T19 /export/corpora3/LDC/LDC2005T19 \
   /export/corpora3/LDC/LDC2004S13 /export/corpora3/LDC/LDC2005S13
Observed Behavior
local/fisher_data_prep.sh: could not find the subdirectory fe_03_p1_sph1
in any of /export/corpora3/LDC/LDC2004T19 /export/corpora3/LDC/LDC2005T19
/export/corpora3/LDC/LDC2004S13 /export/corpora3/LDC/LDC2005S13
*Contents of my four corpora directories to show I have extracted them
correctly:*
ls /export/corpora3/LDC/LDC2004T19
fe_03_p1_tran
ls /export/corpora3/LDC/LDC2005T19
fe_03_p2_tran
ls /export/corpora3/LDC/LDC2004S13
fisher_eng_tr_sp_d1 fisher_eng_tr_sp_d3 fisher_eng_tr_sp_d5
fisher_eng_tr_sp_d7
fisher_eng_tr_sp_d2 fisher_eng_tr_sp_d4 fisher_eng_tr_sp_d6
ls /export/corpora3/LDC/LDC2005S13
fe_03_p2_sph1 fe_03_p2_sph2 fe_03_p2_sph3 fe_03_p2_sph4 fe_03_p2_sph5
fe_03_p2_sph6 fe_03_p2_sph7
*Determining correct location of fe_03_p1_sph1*:
find /export/corpora3 -name fe_03_p1_sph1
/export/corpora3/LDC/LDC2004S13/fisher_eng_tr_sp_d1/fe_03_p1_sph1
find shows that the patch is correct (at least for LDC data that I
downloaded this month.)
I agree with @jtrmal <https://github.com/jtrmal> that using find would be
more robust. The intention of this patch was to be a quick fix for others
who might try to run this recipe.

reopening...

Oh, I can't reopen because you deleted the repo that the PR was pointing to. Is it possible to recreate that repo?

I've re-created the repository.

github still won't let me reopen. Would you mind creating a new PR? I'll merge right away.

Created as #1223.

I tested this patch against fisher_eng_tr_sp_LDC2004S13.zip, which I downloaded from LDC today. It looks like the original committer mistyped the part number, since LDC describes this dataset as "Fisher English Training Speech Data, Part 1."